Support vector machine classification for large data sets via minimum enclosing ball clustering
Abstract
Support vector machine (SVM) is a powerful technique for data classification. Despite its sound theoretical foundations and high classification accuracy, standard SVM is not suitable for classifying large data sets, because its training complexity depends heavily on the size of the data set. This paper presents a novel SVM classification approach for large data sets that uses minimum enclosing ball clustering. After the training data are partitioned by the proposed clustering method, the cluster centers are used for a first-stage SVM classification. The clusters whose centers are support vectors, together with the clusters that contain more than one class, are then used for a second-stage SVM classification; at this stage most of the data have been removed. Experimental results show that the proposed approach achieves good classification accuracy compared with classic SVM, while training is significantly faster than with several other SVM classifiers. © 2007 Elsevier B.V. All rights reserved.
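The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the clustering algorithm or the SVM solver, so a greedy fixed-radius ball cover stands in for minimum enclosing ball clustering, and a Pegasos-style linear SVM stands in for the classifier. The names `ball_cover`, `train_linear_svm`, `two_stage_svm`, `radius`, and `sv_tol` are hypothetical, and training the first stage only on centers of single-class clusters is an assumption.

```python
import math
import random

def ball_cover(X, radius):
    """Greedy cover by fixed-radius balls; a stand-in for MEB clustering."""
    centers, members = [], []
    for i, x in enumerate(X):
        for c, idx in zip(centers, members):
            if math.dist(x, c) <= radius:
                idx.append(i)
                break
        else:  # no existing ball contains x: open a new ball centered at x
            centers.append(x)
            members.append([i])
    return centers, members

def decision(w, x):
    """Linear score w . [x, 1]; the last weight plays the role of the bias."""
    return sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))

def train_linear_svm(X, y, lam=0.01, epochs=100, seed=0):
    """Pegasos-style stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w, t = [0.0] * (len(X[0]) + 1), 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            violated = y[i] * decision(w, X[i]) < 1.0
            w = [(1.0 - eta * lam) * wj for wj in w]
            if violated:
                xi = list(X[i]) + [1.0]
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def two_stage_svm(X, y, radius=0.75, sv_tol=0.1):
    centers, members = ball_cover(X, radius)
    labels = [{y[i] for i in idx} for idx in members]
    # Stage 1: train only on centers of single-class clusters (an assumption).
    pure = [k for k, s in enumerate(labels) if len(s) == 1]
    w1 = train_linear_svm([centers[k] for k in pure],
                          [y[members[k][0]] for k in pure])
    keep = []
    for k, idx in enumerate(members):
        mixed = len(labels[k]) > 1
        # Approximate solver, so a center counts as a support vector when its
        # margin is at most 1 + sv_tol (sv_tol is an illustrative tolerance).
        is_sv = (not mixed) and y[idx[0]] * decision(w1, centers[k]) <= 1.0 + sv_tol
        if mixed or is_sv:
            keep.extend(idx)
    # Stage 2: retrain on the original points of the retained clusters only.
    w2 = train_linear_svm([X[i] for i in keep], [y[i] for i in keep])
    return w2, keep

# Synthetic two-class data (labels -1 / +1), used only to exercise the sketch.
rng = random.Random(42)
X = [(rng.gauss(m, 1.0), rng.gauss(m, 1.0)) for m in (-1.0, 1.0) for _ in range(150)]
y = [-1] * 150 + [1] * 150
w, keep = two_stage_svm(X, y)
accuracy = sum((decision(w, x) > 0) == (lab > 0) for x, lab in zip(X, y)) / len(X)
print(f"kept {len(keep)} of {len(X)} points, training accuracy {accuracy:.2f}")
```

The second stage sees only the points from clusters near the decision boundary, which is where the training-time saving claimed in the abstract would come from. A kernel SVM and the paper's own clustering procedure would replace the two stand-ins in a faithful reproduction.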
Similar resources
Multi-Class SVM for Large Data Sets Considering Models of Classes Distribution
Support Vector Machines (SVM) have gained profound interest amidst the researchers. One of the important issues concerning SVM is with its application to large data sets. It is recognized that SVM is computationally very intensive. This paper presents a novel multi SVM classification approach for large data sets using the sketch of classes distribution which is obtained by using SVM and minimum...
Streamed Learning: One-Pass SVMs
We present a streaming model for large-scale classification (in the context of l2-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The l2-SVM is known to have an equivalent formulation in terms of the minimum enclosing ball (MEB) problem, and an efficient algorithm based on th...
Robustified distance based fuzzy membership function for support vector machine classification
Fuzzification of support vector machine has been utilized to deal with outlier and noise problem. This importance is achieved, by the means of fuzzy membership function, which is generally built based on the distance of the points to the class centroid. The focus of this research is twofold. Firstly, by taking the advantage of robust statistics in the fuzzy SVM, more emphasis on reducing the im...
Outlier Detection for Support Vector Machine using Minimum Covariance Determinant Estimator
The purpose of this paper is to identify the effective points on the performance of one of the important algorithm of data mining namely support vector machine. The final classification decision has been made based on the small portion of data called support vectors. So, existence of the atypical observations in the aforementioned points, will result in deviation from the correct decision. Thus...
Single Pass SVM using Minimum Enclosing Ball of Streaming Data
We present a stream algorithm for large scale classification (in the context of l2-SVM) by leveraging connections between learning and computational geometry. The stream model [1] imposes the constraint that only a single pass over the data is allowed. We study the streaming model for the problem of binary classification with SVMs and propose a single pass SVM algorithm based on the minimum enc...
Journal title:
- Neurocomputing
Volume 71, Issue
Pages -
Publication date 2008